What is High Availability?

In this lesson, you will learn about high availability and its importance in online services.

Highly available computing infrastructure is the norm in the computing industry today. When it comes to cloud platforms, it is their key feature that enables the workloads running on them to be highly available.

This lesson is an insight into high availability. It covers all the frequently asked questions about it, including:

  • What is it?
  • Why is it so important to businesses?
  • What is a highly available cluster?
  • How do cloud platforms ensure the high availability of the services running on them?
  • What are fault tolerance and redundancy? How are they related to high availability?

So, without any further ado. Let’s get on with it.

What is high availability?#

High availability, also known as HA, is the ability of the system to stay online despite having failures at the infrastructural level in real-time.

High availability ensures the service’s uptime is much more than usual. It improves the reliability of the system and ensures minimum downtime.

The sole mission of highly available systems is to stay online and stay connected. A rudimentary real-world example of staying highly available is having backup generators to ensure continuous power supply in case of any power outages in hospitals, air traffic control towers and so on.

In the industry, HA is often expressed as a percentage. For instance, when the system is 99.99999% highly available, it simply means 99.99999% of the total hosting time the service will be up. You might often see this in the Service Level Agreements (SLA) of cloud platforms.

How important is high availability to online services?#

Social applications going down for a bit and then bouncing back might not impact businesses that much. However, there are mission-critical systems like aircraft systems, spacecrafts, mining machines, hospital servers, and finance stock-market systems that cannot afford to go down at any time. After all, lives depend on it.

The smooth functioning of the mission-critical systems relies on continual connectivity with their network/servers. These are the instances that do not work without super highly available infrastructures.

Besides, no service likes to go down, critical or not. To meet the high availability requirements, systems are designed to be fault-tolerant and their components are made redundant.

What are fault-tolerant and redundancy in distributed systems? I will discuss this up next.

Scalability Quiz
Reasons For System Failures
Mark as Completed
Report an Issue